penalized (version 0.9-6)

Cross-validation in penalized generalized linear models: Penalized regression

Description

Cross-validating generalized linear models with L1 (lasso) and/or L2 (ridge) penalties, using likelihood cross-validation.

Usage

cvl (response, penalized, unpenalized, lambda1 = 0, lambda2 = 0, 
    data, model = c("cox", "logistic", "linear"), startbeta, 
    startgamma, fold, epsilon = 1e-10, maxiter, standardize = FALSE, 
    trace = TRUE)

optL1 (response, penalized, unpenalized, minlambda1, maxlambda1, 
    lambda2 = 0, data, model = c("cox", "logistic", "linear"), 
    startbeta, startgamma, fold, epsilon = 1e-10, maxiter, 
    standardize = FALSE, trace = TRUE, tol = .Machine$double.eps^0.25)

optL2 (response, penalized, unpenalized, lambda1 = 0, minlambda2, 
    maxlambda2, data, model = c("cox", "logistic", "linear"), 
    startbeta, startgamma, fold, epsilon = 1e-10, maxiter, 
    standardize = FALSE, trace = TRUE, tol = .Machine$double.eps^0.25)

profL1 (response, penalized, unpenalized, minlambda1, maxlambda1, 
    lambda2 = 0, data, model = c("cox", "logistic", "linear"), startbeta, 
    startgamma, fold, epsilon = 1e-10, maxiter, standardize = FALSE, 
    trace = TRUE, steps = 100, autominsteps = steps/5, log = FALSE)
  
profL2 (response, penalized, unpenalized, lambda1 = 0, minlambda2, 
    maxlambda2, data, model = c("cox", "logistic", "linear"), startbeta, 
    startgamma, fold, epsilon = 1e-10, maxiter, standardize = FALSE, 
    trace = TRUE, steps = 100, autominsteps = steps/5, log = TRUE)

Arguments

response
The response variable (vector). This should be a numeric vector for linear regression, a Surv object for Cox regression and a vector of 0/1 values for logistic regression.
penalized
The penalized covariates. These may be specified either as a matrix or as a (one-sided) formula object. See also under data.
unpenalized
Additional unpenalized covariates, specified in the same way as penalized. Note that an unpenalized intercept is included in the model by default (except in the Cox model). This can be suppressed by specifying unpenalized = ~0.
lambda1, lambda2
The fixed values of the tuning parameters for L1 and L2 penalization. Both may be vectors if different covariates are to be penalized differently.
minlambda1, minlambda2, maxlambda1, maxlambda2
The values of the tuning parameters for L1 or L2 penalization between which the cross-validated likelihood is to be profiled or optimized.
data
A data.frame used to evaluate response, and the terms of penalized or unpenalized when these have been specified as a formula object.
model
The model to be used. If missing, the model will be guessed from the response input.
startbeta
Starting values for the regression coefficients of the penalized covariates. These starting values will be used only for the first values of lambda1 and lambda2.
startgamma
Starting values for the regression coefficients of the unpenalized covariates. These starting values will be used only for the first values of lambda1 and lambda2.
fold
The fold for cross-validation. May be supplied as a single number (between 2 and n) giving the number of folds, or, alternatively, as a length n vector with values in 1:fold, specifying exactly which subjects are assigned to w
epsilon
The convergence criterion. As in glm. Convergence is judged separately on the likelihood and on the penalty.
maxiter
The maximum number of iterations allowed. Set by default at 25 when lambda1 = 0, infinite otherwise.
standardize
If TRUE, standardizes all penalized covariates to unit central L2-norm before applying penalization.
trace
If TRUE, prints progress information. Note that setting trace = TRUE may slow down the algorithm (but it often feels quicker).
steps
The maximum number of steps between minlambda1 and maxlambda1 or minlambda2 and maxlambda2 at which the cross-validated likelihood is to be calculated.
autominsteps
The minimum number of steps between minlambda1 and maxlambda1 or minlambda2 and maxlambda2 at which the cross-validated likelihood is to be calculated. If autominsteps is smaller than
log
If FALSE, the steps between minlambda1 and maxlambda1 or minlambda2 and maxlambda2 are equidistant on a linear scale; if TRUE, on a logarithmic scale. Please note the different
tol
The tolerance of the Brent algorithm used for minimization. See also optimize.
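
The fold and log arguments can be illustrated with a short base R sketch. It does not reproduce the package's internal code: the sample size n, the number of folds k, the range 1 to 100, and the choice of exactly five grid points are all assumptions made for illustration. The first part builds a length n vector with values in 1:k that could be passed as fold; the second part contrasts a grid that is equidistant on a linear scale with one that is equidistant on a logarithmic scale.

```r
# Illustration only: n, k, and the lambda range are made-up values.
set.seed(42)
n <- 20   # hypothetical number of subjects
k <- 5    # hypothetical number of folds

# A length-n vector with values in 1:k, assigning each subject to a fold.
# rep() balances the fold sizes; sample() shuffles the assignment.
fold <- sample(rep(seq_len(k), length.out = n))
table(fold)

# Two ways to space points between a minimum and a maximum lambda.
minlambda <- 1
maxlambda <- 100
npoints   <- 5   # assumed number of grid points, for illustration

# Equidistant on a linear scale (the behaviour described for log = FALSE)
linear_grid <- seq(minlambda, maxlambda, length.out = npoints)

# Equidistant on a logarithmic scale (the behaviour described for log = TRUE)
log_grid <- exp(seq(log(minlambda), log(maxlambda), length.out = npoints))

print(linear_grid)
print(log_grid)
```

Supplying an explicit fold vector like this makes it possible to reuse the same allocation of subjects to folds across several calls, so that cross-validated likelihoods are compared on identical splits.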

Value

  • A named list. See details.

Details

All five functions return a list with the following named elements: lambda, cvl, fold, predictions, fullfit.

See Also

penalized, penfit, plotpath.

Examples

library(survival)   # provides Surv()
library(penalized)
data(nki70)

# Finding an optimal cross-validated likelihood
attach(nki70)
opt <- optL1(Surv(time, event), penalized = nki70[,8:77], fold = 10)
coefficients(opt$fullfit)
plot(opt$predictions)

# Plotting the profile of the cross-validated likelihood
prof <- profL1(Surv(time, event), penalized = nki70[,8:77], 
    fold = opt$fold, steps=20)
plot(prof$lambda, prof$cvl, type="l")
plotpath(prof$fullfit)
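
The examples above cover optL1 and profL1. As a complementary sketch, cvl can be called directly to compare two fixed values of lambda1 on the same folds. The two lambda1 values (2 and 10), the seed, and the variable names fit_a and fit_b are arbitrary choices for illustration, not recommendations; the data argument is used here instead of attach.

```r
library(survival)   # provides Surv()
library(penalized)
data(nki70)

# Fix the allocation of subjects to folds so both fits use identical splits.
set.seed(1)
n <- nrow(nki70)
fold <- sample(rep(1:10, length.out = n))

# Cross-validated likelihood at two arbitrary, illustrative lambda1 values.
fit_a <- cvl(Surv(time, event), penalized = nki70[, 8:77],
             lambda1 = 2, fold = fold, data = nki70, trace = FALSE)
fit_b <- cvl(Surv(time, event), penalized = nki70[, 8:77],
             lambda1 = 10, fold = fold, data = nki70, trace = FALSE)

# Compare the cross-validated log likelihoods; the larger value is preferred.
c(lambda1_2 = fit_a$cvl, lambda1_10 = fit_b$cvl)
```

Because both calls receive the same fold vector, any difference in the cvl element reflects the penalty rather than a different random split.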
